{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "qjaFmAyEzcLz",
    "papermill": {
     "duration": 0.008106,
     "end_time": "2021-04-23T10:41:33.911944",
     "exception": false,
     "start_time": "2021-04-23T10:41:33.903838",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# Double/Debiased ML for Partially Linear IV Model\n",
    "\n",
    "This is a simple implementation of Debiased Machine Learning for the Partially Linear\n",
    "IV Regression Model, which provides an application of DML IV inference.\n",
    "\n",
    "\n",
    "Reference:\n",
    "\n",
    "- https://arxiv.org/abs/1608.00060\n",
    "- https://www.amazon.com/Business-Data-Science-Combining-Accelerate/dp/1260452778\n",
    "\n",
    "The code is based on the book.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "G3YNyr-ezcL4",
    "papermill": {
     "duration": 0.006963,
     "end_time": "2021-04-23T10:41:33.926085",
     "exception": false,
     "start_time": "2021-04-23T10:41:33.919122",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "\n",
    "# Partially Linear IV Model\n",
    "\n",
    "We consider the partially linear structural equation model:\n",
    "\\begin{eqnarray}\n",
    " &  Y - D\\theta_0 = g_0(X) + \\zeta,  & E[\\zeta \\mid Z,X]= 0,\\\\\n",
    "  & Z = m_0(X) +  V,   &  E[V \\mid X] = 0.\n",
    "\\end{eqnarray}\n",
    "\n",
    "\n",
    "Note that this model is not a regression model unless $Z=D$.  The model  is a canonical\n",
    "model in causal inference, going back to P. Wright's work on IV methods for estimating demand/supply equations, with the modern difference being that $g_0$ and $m_0$ are nonlinear, potentially complicated functions of high-dimensional $X$.  \n",
    "\n",
    "\n",
    "The idea of this model is that there is a structural or causal relation between $Y$ and $D$, captured by $\\theta_0$, and $g_0(X) + \\zeta$ is the stochastic error, partly explained by covariates $X$.  $V$ and $\\zeta$ are stochastic errors that are not explained by $X$. Since $Y$ and $D$ are jointly determined, we need an external factor, commonly referred to as an instrument, $Z$ to create exogenous variation\n",
    "in $D$.   Note that $Z$ should affect $D$.  The $X$ here serve again as confounding factors, so we can think of variation in $Z$ as being exogenous only conditional on $X$.\n",
    "\n",
    "\n",
    "The causal DAG this model corresponds to is given by:\n",
    "$$\n",
    "Z \\to D,  X \\to (Y, Z, D),  L \\to (Y,D),\n",
    "$$\n",
    "where $L$ is the latent confounder affecting both $Y$ and $D$, but not $Z$.\n",
    "\n",
    "\n",
    "\n",
    "---\n",
    "\n",
    "# Example\n",
    "\n",
    "A simple contextual example is from biostatistics, where $Y$ is a health outcome and $D$ is indicator of smoking.  Thus, $\\theta_0$ is captures the effect of smoking on health.  Health outcome $Y$ and smoking behavior $D$ are treated as being jointly determined.  $X$ represents patient characteristics, and $Z$ could be a doctor's advice not to smoke (or another behavioral treatment) that may affect the outcome $Y$ only through shifting the behavior $D$, conditional on characteristics $X$.   \n",
    "\n",
    "----\n",
    "\n",
    "\n",
    "\n",
    "# PLIVM in Residualized Form\n",
    "\n",
    "The PLIV model above can be rewritten in the following residualized form:\n",
    "$$\n",
    "  \\tilde Y = \\tilde D \\theta_0 + \\zeta,   \\quad  E[\\zeta \\mid V,X]= 0,\n",
    "$$\n",
    "where\n",
    "$$\n",
    " \\tilde Y = (Y- \\ell_0(X)),  \\quad \\ell_0(X) = E[Y \\mid X] \\\\\n",
    "   \\tilde D = (D - r_0(X)), \\quad r_0(X) = E[D \\mid X] \\\\\n",
    "   \\tilde Z = (Z- m_0(X)), \\quad m_0(X) = E[Z \\mid X].\n",
    "$$\n",
    "   The \"tilde\" variables (e.g. $\\tilde Y$) above represent original variables after taking out or \"partialling out\"\n",
    "  the effect of $X$.  Note that $\\theta_0$ is identified from this equation if $V$\n",
    "  and $U$ have non-zero correlation, which automatically means that $U$ and $V$\n",
    "  must have non-zero variation.\n",
    "\n",
    "  \n",
    "\n",
    "-----\n",
    "\n",
    "# DML for PLIV Model\n",
    "\n",
    "Given identification, DML  proceeds as follows\n",
    "\n",
    "Compute the estimates $\\hat \\ell_0$, $\\hat r_0$, and $\\hat m_0$ , which amounts\n",
    "to solving the three problems of predicting $Y$, $D$, and $Z$ using\n",
    "$X$, using any generic  ML method, giving us estimated residuals\n",
    "$$\n",
    "\\tilde Y = Y - \\hat \\ell_0(X), \\\\ \\tilde D= D - \\hat r_0(X), \\\\ \\tilde Z = Z- \\hat m_0(X).\n",
    "$$\n",
    "The estimates should be of a cross-validated form, as detailed in the algorithm below.\n",
    "\n",
    "Estimate $\\theta_0$ by the the intstrumental\n",
    "variable regression of $\\tilde Y$ on $\\tilde D$ using $\\tilde Z$ as an instrument.\n",
    "Use the conventional inference for the IV regression estimator, ignoring\n",
    "the estimation error in these residuals.\n",
    "\n",
    "The reason we work with this residualized form is that it eliminates the bias\n",
    "arising when solving the prediction problem in stage 1. The role of cross-validation\n",
    "is to avoid another source of bias due to potential overfitting.\n",
    "\n",
    "The estimator is adaptive,\n",
    "in the sense that the first stage estimation errors do not affect the second\n",
    "stage errors.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_kg_hide-output": true,
    "id": "yGW6JhG5zcL5",
    "papermill": {
     "duration": 34.670095,
     "end_time": "2021-04-23T10:42:08.603197",
     "exception": false,
     "start_time": "2021-04-23T10:41:33.933102",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Import relevant packages\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "from sklearn.model_selection import cross_val_predict, KFold\n",
    "from sklearn.pipeline import make_pipeline\n",
    "from sklearn.preprocessing import StandardScaler\n",
    "from sklearn.ensemble import RandomForestRegressor\n",
    "import patsy\n",
    "import warnings\n",
    "from sklearn.base import BaseEstimator\n",
    "import statsmodels.api as sm\n",
    "from statsmodels.tools import add_constant\n",
    "warnings.simplefilter('ignore')\n",
    "np.random.seed(1234)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "j2WVUbBDzcL-",
    "papermill": {
     "duration": 1.846109,
     "end_time": "2021-04-23T10:42:10.458406",
     "exception": false,
     "start_time": "2021-04-23T10:42:08.612297",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "def dml(X, Z, D, y, modely, modeld, modelz, *, nfolds, classifier=False):\n",
    "    '''\n",
    "    DML for the Partially Linear Model setting with cross-fitting\n",
    "\n",
    "    Input\n",
    "    -----\n",
    "    X: the controls\n",
    "    Z: the instrument\n",
    "    D: the treatment\n",
    "    y: the outcome\n",
    "    modely: the ML model for predicting the outcome y\n",
    "    modeld: the ML model for predicting the treatment D\n",
    "    modelz: the ML model for predicting the instrument Z\n",
    "    nfolds: the number of folds in cross-fitting\n",
    "    classifier: bool, whether the modeld is a classifier or a regressor\n",
    "\n",
    "    Output\n",
    "    ------\n",
    "    point: the point estimate of the treatment effect of D on y\n",
    "    stderr: the standard error of the treatment effect\n",
    "    yhat: the cross-fitted predictions for the outcome y\n",
    "    Dhat: the cross-fitted predictions for the treatment D\n",
    "    Zhat: the cross-fitted predictions for the instrument Z\n",
    "    resy: the outcome residuals\n",
    "    resD: the treatment residuals\n",
    "    resZ: the instrument residuals\n",
    "    epsilon: the final residual-on-residual OLS regression residual\n",
    "    '''\n",
    "    cv = KFold(n_splits=nfolds, shuffle=True, random_state=123)  # shuffled k-folds\n",
    "    yhat = cross_val_predict(modely, X, y, cv=cv, n_jobs=-1)  # out-of-fold predictions for y\n",
    "    # out-of-fold predictions for D\n",
    "    # use predict or predict_proba dependent on classifier or regressor for D\n",
    "    if classifier:\n",
    "        Dhat = cross_val_predict(modeld, X, D, cv=cv, method='predict_proba', n_jobs=-1)[:, 1]\n",
    "        Zhat = cross_val_predict(modelz, X, Z, cv=cv, method='predict_proba', n_jobs=-1)[:, 1]\n",
    "    else:\n",
    "        Dhat = cross_val_predict(modeld, X, D, cv=cv, n_jobs=-1)\n",
    "        Zhat = cross_val_predict(modelz, X, Z, cv=cv, n_jobs=-1)\n",
    "    # calculate outcome and treatment residuals\n",
    "    resy = y - yhat\n",
    "    resD = D - Dhat\n",
    "    resZ = Z - Zhat\n",
    "    # rename for downstream tasks\n",
    "    resy = resy.rename(\"resy\")\n",
    "    resD = resD.rename(\"resD\")\n",
    "    resZ = resZ.rename(\"resZ\")\n",
    "\n",
    "    # final stage ols based point estimate and standard error\n",
    "    point = np.mean(resy * resZ) / np.mean(resD * resZ)\n",
    "    epsilon = resy - point * resD\n",
    "    var = np.mean(epsilon**2 * resZ**2) / np.mean(resD * resZ)**2\n",
    "    stderr = np.sqrt(var / X.shape[0])\n",
    "    return point, stderr, yhat, Dhat, Zhat, resy, resD, resZ, epsilon"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "F6U6fnzne-77"
   },
   "outputs": [],
   "source": [
    "def summary(point, stderr, yhat, Dhat, Zhat, resy, resD, resZ, epsilon, X, D, Z, y, *, name):\n",
    "    '''\n",
    "    Convenience summary function that takes the results of the DML IV function\n",
    "    and summarizes several estimation quantities and performance metrics.\n",
    "    '''\n",
    "    return pd.DataFrame({'estimate': point,  # point estimate\n",
    "                         'stderr': stderr,  # standard error\n",
    "                         'lower': point - 1.96 * stderr,  # lower end of 95% confidence interval\n",
    "                         'upper': point + 1.96 * stderr,  # upper end of 95% confidence interval\n",
    "                         'rmse y': np.sqrt(np.mean(resy**2)),  # RMSE of model that predicts outcome y\n",
    "                         'rmse Z': np.sqrt(np.mean(resZ**2)),  # RMSE of model that predicts instrument Z\n",
    "                         'rmse D': np.sqrt(np.mean(resD**2))  # RMSE of model that predicts treatment D\n",
    "                         }, index=[name])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "x1g3XjsIzcL_",
    "papermill": {
     "duration": 0.011698,
     "end_time": "2021-04-23T10:42:10.482689",
     "exception": false,
     "start_time": "2021-04-23T10:42:10.470991",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "-----\n",
    "\n",
    "\n",
    "# Emprical Example:  Acemoglu, Johnson, Robinson (AER).\n",
    "\n",
    "* Y is log GDP;\n",
    "* D is a measure of Protection from Expropriation, a proxy for\n",
    "quality of insitutions;\n",
    "* Z is the log of Settler's mortality;\n",
    "* W are geographical variables (latitude, latitude squared, continent dummies as well as interactions)\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "0Pc6OCp24rji"
   },
   "outputs": [],
   "source": [
    "file = \"https://raw.githubusercontent.com/CausalAIBook/MetricsMLNotebooks/main/data/AJR.csv\"\n",
    "data = pd.read_csv(file)\n",
    "data.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "3rGw2hAqSx2Y"
   },
   "source": [
    "Run the following command to install hdmpy for rigorous lasso:\n",
    "\n",
    "``` !pip install multiprocess ```\n",
    "\n",
    "\n",
    "```!git clone https://github.com/maxhuppertz/hdmpy.git ```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "O5Rknnz2Rh9O"
   },
   "outputs": [],
   "source": [
    "!pip install multiprocess\n",
    "!git clone https://github.com/maxhuppertz/hdmpy.git"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Kb9N0c5uS47n"
   },
   "outputs": [],
   "source": [
    "import hdmpy\n",
    "\n",
    "\n",
    "class RLasso(BaseEstimator):\n",
    "\n",
    "    def __init__(self, *, post=True):\n",
    "        self.post = post\n",
    "\n",
    "    def fit(self, X, y):\n",
    "        self.rlasso_ = hdmpy.rlasso(X, y, post=self.post)\n",
    "        return self\n",
    "\n",
    "    def predict(self, X):\n",
    "        return np.array(X) @ np.array(self.rlasso_.est['beta']).flatten() + np.array(self.rlasso_.est['intercept'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "nO6a6tJWS-FO"
   },
   "outputs": [],
   "source": [
    "y = data['GDP']\n",
    "d = data['Exprop']\n",
    "z = data['logMort']\n",
    "\n",
    "controls = data.drop(['GDP', 'Exprop', 'logMort'], axis=1)\n",
    "x = patsy.dmatrix('-1 + (Latitude + Latitude2 + Africa + Asia + Namer + Samer)**2',\n",
    "                  data=controls, return_type='dataframe')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "kpPe250uelE7"
   },
   "outputs": [],
   "source": [
    "# DML with Post-Lasso:\n",
    "modely = make_pipeline(StandardScaler(), RLasso(post=True))\n",
    "modeld = make_pipeline(StandardScaler(), RLasso(post=True))\n",
    "modelz = make_pipeline(StandardScaler(), RLasso(post=True))\n",
    "\n",
    "# Run DML model with nfolds folds of cross-fitting\n",
    "res_rlasso = dml(x, z, d, y, modely, modeld, modelz, nfolds=5)\n",
    "table_rlasso = summary(*res_rlasso, x, d, z, y, name='RLasso')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "pdKuGFpXglp8"
   },
   "outputs": [],
   "source": [
    "# DML with Random Forests. RFs don't require scaling but we do it for consistency\n",
    "modely = make_pipeline(StandardScaler(),\n",
    "                       RandomForestRegressor(n_estimators=100, min_samples_leaf=5, random_state=1234))\n",
    "modeld = make_pipeline(StandardScaler(),\n",
    "                       RandomForestRegressor(n_estimators=100, min_samples_leaf=5, random_state=1234))\n",
    "modelz = make_pipeline(StandardScaler(),\n",
    "                       RandomForestRegressor(n_estimators=100, min_samples_leaf=5, random_state=1234))\n",
    "\n",
    "# Run DML model with nfolds folds of cross-fitting (computationally intensive)\n",
    "res_RF = dml(x, z, d, y, modely, modeld, modelz, nfolds=5)\n",
    "table_RF = summary(*res_RF, x, d, z, y, name='RF')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "ijo8p0Moig_F"
   },
   "outputs": [],
   "source": [
    "results = pd.concat([table_rlasso, table_RF], axis=0)\n",
    "results"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "r7l6cFi2zcMA",
    "papermill": {
     "duration": 0.013703,
     "end_time": "2021-04-23T10:42:16.269251",
     "exception": false,
     "start_time": "2021-04-23T10:42:16.255548",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# Weak Instruments?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "rsUnPDfpzcMB",
    "papermill": {
     "duration": 16.107321,
     "end_time": "2021-04-23T10:42:32.390552",
     "exception": false,
     "start_time": "2021-04-23T10:42:16.283231",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Summary for RLasso\n",
    "resy_rlasso, resD_rlasso, resZ_rlasso = res_rlasso[5], res_rlasso[6], res_rlasso[7]\n",
    "# Using HC3 robust standard errors\n",
    "regDZ_rlasso = sm.OLS(resD_rlasso, add_constant(resZ_rlasso)).fit(cov_type='HC3', use_t=True)\n",
    "print(regDZ_rlasso.summary())\n",
    "\n",
    "# Summary for RF\n",
    "resy_RF, resD_RF, resZ_RF = res_RF[5], res_RF[6], res_RF[7]\n",
    "# Using HC3 robust standard errors\n",
    "regDZ_RF = sm.OLS(resD_RF, add_constant(resZ_RF)).fit(cov_type='HC3', use_t=True)\n",
    "print(regDZ_RF.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "L3fDDOmfzcMD",
    "papermill": {
     "duration": 0.015919,
     "end_time": "2021-04-23T10:42:32.423865",
     "exception": false,
     "start_time": "2021-04-23T10:42:32.407946",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "## Anderson-Rubin Idea for Inference with Weak Instruments\n",
    "\n",
    "As shown above, we may have weak instruments because the t-stat in the regression $\\tilde D \\sim \\tilde Z$ is small relative to standard rules of thumb -- for example Stock and Yogo (2005) suggest accounting for weak instruments if the first stage F-statistic is less than 10 (and more recent work suggests even larger cutoffs).\n",
    "\n",
    "\n",
    " Here, we consider one specific approach (from Anderson and Rubin (1949)) to doing valid inference under weak identification based upon the statistic:\n",
    "$$\n",
    "C(\\theta) = \\frac{ |E_n [(\\tilde Y -  \\theta  \\tilde D) \\tilde Z]|^2}{ \\mathbb{V}_n [(\\tilde Y -  \\theta \\tilde  D) \\tilde Z ]/n}.\n",
    "$$\n",
    "The empirical variance $\\mathbb{V}_n$ is defined as:\n",
    "\\begin{align*}\n",
    "\\mathbb{V}_{n}[ g(W_i)] &:=   \\mathbb{E}_{n}g(W_i)g(W_i)' - \\mathbb{E}_{n}[ g(W_i)]\\mathbb{E}_{n}[ g(W_i)]'.\n",
    "\\end{align*}\n",
    "\n",
    "If $\\theta_0 = \\theta$, then $C(\\theta) \\overset{a}{\\sim} N(0,1)^2 = \\chi^2(1)$. Therefore,  we can reject the hypothesis $\\theta_0 = \\theta$ at level $a$ (for example $a = .05$ for a 5\\% level test)  if $C(\\theta)> c(1-a)$ where $c(1-a)$ is the $(1-a)$- quantile of a $\\chi^2(1)$ variable.  The probability of falsely rejecting the true hypothesis is approximately $a \\times 100\\%$.  \n",
    "To construct a $(1-a) \\times 100$\\% confidence region for $\\theta$, we can then simply invert the test by collecting all parameter values that are not rejected at the $a$ level:\n",
    "$$\n",
    "CR(\\theta) = \\{ \\theta \\in \\Theta: C(\\theta)  \\leq c(1-a)\\}.\n",
    "$$\n",
    "\n",
    "\n",
    "In more complex settings with many controls or controls that enter with unknown functional form, we can simply replace the residuals $\\tilde Y$, $\\tilde D$, and $\\tilde Z$ by machine\n",
    "learned cross-fitted residuals $\\check Y$, $\\check D$, and $ \\check Z$.  Thanks to the orthogonality of the IV moment condition underlying the formulation outlined above, we can formally assert that the properties of $C(\\theta)$ and the subsequent testing procedure and confidence region for $\\theta$ continue to hold when using cross-fitted residuals. We will further be able to apply the general procedure to cases where $D$\n",
    "is a vector, with a suitable adjustment of the statistic $C(\\theta)$.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "PG2tPuFzzcMD",
    "papermill": {
     "duration": 0.015943,
     "end_time": "2021-04-23T10:42:32.455719",
     "exception": false,
     "start_time": "2021-04-23T10:42:32.439776",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "So let's carry out DML inference combined with Anderson-Rubin Idea"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "6weyaIdJ3pWs"
   },
   "outputs": [],
   "source": [
    "from scipy.stats import chi2\n",
    "\n",
    "\n",
    "def DML_AR_PLIV(rY, rD, rZ, grid, alpha=0.05):\n",
    "    n = len(rY)\n",
    "    Cstat = np.zeros(len(grid))\n",
    "\n",
    "    for i in range(len(grid)):\n",
    "        term1 = (rY - grid[i] * rD) * rZ\n",
    "        Cstat[i] = n * (np.mean(term1))**2 / np.var(term1)\n",
    "\n",
    "    Cstat = np.nan_to_num(Cstat)  # Replace any NaNs with zeros\n",
    "\n",
    "    lower_bound = min(grid[Cstat < chi2.ppf(1 - alpha, 1)])\n",
    "    upper_bound = max(grid[Cstat < chi2.ppf(1 - alpha, 1)])\n",
    "\n",
    "    plt.plot(grid, Cstat, linestyle='-', color='black')\n",
    "    plt.axhline(y=chi2.ppf(1 - alpha, 1), linestyle='--', color='blue')\n",
    "    plt.axvline(x=lower_bound, linestyle='--', color='red')\n",
    "    plt.axvline(x=upper_bound, linestyle='--', color='red')\n",
    "    plt.xlabel('Effect of institutions')\n",
    "    plt.ylabel('Statistic')\n",
    "    plt.title('')\n",
    "    plt.show()\n",
    "\n",
    "    return {'UB': upper_bound, 'LB': lower_bound}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "k9bB2O13zcME",
    "papermill": {
     "duration": 0.478528,
     "end_time": "2021-04-23T10:42:33.002469",
     "exception": false,
     "start_time": "2021-04-23T10:42:32.523941",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Using RLasso\n",
    "dml_ar_rlasso = DML_AR_PLIV(rY=resy_rlasso, rD=resD_rlasso, rZ=resZ_rlasso, grid=np.arange(-2, 2, 0.01))\n",
    "print(\"RLasso UB and LB: \", dml_ar_rlasso)\n",
    "# Using RF\n",
    "dml_ar_RF = DML_AR_PLIV(rY=resy_RF, rD=resD_RF, rZ=resZ_RF, grid=np.arange(-2, 2, 0.01))\n",
    "print(\"RF UB and LB: \", dml_ar_RF)"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5"
  },
  "papermill": {
   "duration": 62.79319,
   "end_time": "2021-04-23T10:42:34.038582",
   "environment_variables": {},
   "exception": null,
   "input_path": "__notebook__.ipynb",
   "output_path": "__notebook__.ipynb",
   "parameters": {},
   "start_time": "2021-04-23T10:41:31.245392",
   "version": "2.1.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
